Logistic Regression Classification for Uncertain Data

Author

  • Abdallah Bashir Musa
Abstract

Logistic regression (LR) is a well-known classification technique commonly used in statistics, machine learning, and data mining for learning a binary response. It assumes that data values are known precisely, but this assumption does not hold in all settings. Uncertain data arise in many applications because of the data collection methodology, as with repeated measures and outdated sources, and because of imprecise measurement, as in physical experiments. The study of uncertain data has therefore become an active area of research. Under uncertainty, the value of a data item is typically characterized by multiple values, so machine learning techniques must also be able to handle uncertain data. This paper studies a modification of the LR technique to handle uncertain data. Statistical inference and probability theory are used to obtain a single unbiased estimator that represents the multiple values sufficiently and efficiently. Maximum likelihood estimators (MLE) and probability density functions (PDF) are used to capture the uncertainty. Experiments on UCI data sets demonstrate that an uncertain LR classifier can be constructed successfully and that its accuracy can be improved by taking the uncertainty information into account.
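The abstract does not spell out the estimator or the training procedure, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes that each uncertain value is a list of repeated measurements that are i.i.d. Gaussian, in which case the sample mean is both the MLE and an unbiased estimator of the true value. The function names (`collapse_measurements`, `fit_logistic_regression`, `predict_proba`) and the toy data are hypothetical, and the plain gradient-descent fit stands in for whatever optimizer the paper actually uses.

```python
import math

def collapse_measurements(values):
    """Return (mean, variance) MLEs for repeated measurements of one item.

    Assumes i.i.d. Gaussian measurements (an assumption of this sketch).
    """
    n = len(values)
    mean = sum(values) / n                            # MLE of the mean; also unbiased
    var = sum((v - mean) ** 2 for v in values) / n    # MLE of the variance (divides by n)
    return mean, var

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic_regression(X, y, learning_rate=0.1, epochs=10000):
    """Fit weights and bias by batch gradient descent on the mean log-loss."""
    n, d = len(X), len(X[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, label in zip(X, y):
            error = sigmoid(sum(w * x for w, x in zip(weights, row)) + bias) - label
            for j in range(d):
                grad_w[j] += error * row[j]
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias

def predict_proba(weights, bias, row):
    """Estimated probability that `row` belongs to class 1."""
    return sigmoid(sum(w * x for w, x in zip(weights, row)) + bias)

# Hypothetical toy data: one feature, three repeated readings per item.
uncertain_items = [
    ([0.9, 1.1, 1.0], 0),
    ([1.3, 1.1, 1.2], 0),
    ([0.7, 0.9, 0.8], 0),
    ([3.1, 2.9, 3.0], 1),
    ([3.0, 3.2, 3.1], 1),
    ([2.8, 3.0, 2.9], 1),
]

# Replace each set of readings by its point estimate, then train as usual.
X = [[collapse_measurements(readings)[0]] for readings, _ in uncertain_items]
y = [label for _, label in uncertain_items]
weights, bias = fit_logistic_regression(X, y)
```

The variance estimate is returned but not used in the fit. How the paper feeds the PDF information into the classifier is not stated in the abstract, so that step is left out rather than guessed.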


Similar resources

Factors Influencing Drug Injection History among Prisoners: A Comparison between Classification and Regression Trees and Logistic Regression Analysis

Background: Due to the importance of medical studies, researchers in this field should be familiar with various types of statistical analyses to select the most appropriate method based on the characteristics of their data sets. Classification and regression trees (CARTs) can be complementary to regression models. We compared the performance of a logistic regression model and a CART in predi...


A Bayesian mixture model for classification of certain and uncertain data

There are different types of classification methods for classifying certain data. The values of the variables are not always certain; they may instead lie within an interval, in which case the data are called uncertain. In recent years, under the assumption that uncertain data are normally distributed, several estimators of the mean and variance of this distribution have been proposed. In this paper, we co...


Comparing the Results of Logistic Regression Model and Classification and Regression Tree Analysis in Determining Prognostic Factors for Coronary Artery Disease in Mashhad, Iran

Background and purpose: Understanding of the risk factors for cardiovascular artery disease, which is the leading cause of death worldwide, can lead to essential changes in its etiology, prevalence, and treatment. The aim of this study was to compare the results of logistic regression model and Classification and Regression Tree Analysis (CART) in determining the prognostic factors for coronary...


Interval network data envelopment analysis model for classification of investment companies in the presence of uncertain data

The main purpose of this paper is to propose an approach for performance measurement, classification, and ranking of investment companies (ICs) that considers internal structure and uncertainty. To reach this goal, the interval network data envelopment analysis (INDEA) models are extended. This model is capable of modeling two-stage efficiency with intermediate measures i...


The Effect of Transitive Closure on the Calibration of Logistic Regression for Entity Resolution

This paper describes a series of experiments in using logistic regression machine learning as a method for entity resolution. From these experiments the authors concluded that when a supervised ML algorithm is trained to classify a pair of entity references as a linked or not-linked pair, the evaluation of the model’s performance should take into account the transitive closure of its pairwise lin...




Publication date: 2014